Implementation of a Distributed Fault-Tolerant NoC-based Architecture for the Single-Event Upset Detector
Authors
Abstract
With the rise of the private sector in space exploration, space missions are becoming more frequent. Because modern electronics are also scaling to be both faster and denser, tolerance to radiation effects has become a critical design requirement for on-board space computer systems. Radiation damage falls into two categories: Total Ionizing Dose (TID) and Single Event Effects (SEE). Approaches to the radiation-hostile environment of space exist at several design levels. Most commonly, on-board space electronics use radiation-hardened components; this limits what space exploration can achieve, however, because such components are usually two or three generations older and a few orders of magnitude more expensive. Since SEEs and TID can limit the lifespan of a mission, space electronics need a fault-tolerant architecture that can leverage the high performance and low cost of commercial off-the-shelf (COTS) components. To investigate an in-house, COTS-based fault-tolerant solution that can detect and mitigate Single-Event Upsets (SEUs), the Single-Event Upset Detector (SEUD) project was proposed by the Department of Electronic Systems at the Royal Institute of Technology (KTH), and it will be hosted by the KTH MIniature STudent (MIST) satellite. The hypothesis rests on the observation that modern, faster, SRAM-based COTS Field-Programmable Gate Arrays (FPGAs) are highly susceptible to SEUs but provide advanced mitigation techniques such as partial reconfigurability (e.g. Artix-7), while flash-based FPGAs offer SEU-immune configuration memory (e.g. SmartFusion2) at the cost of lower operating frequencies. The proposed architecture is composed of two FPGA devices connected through an in-house, off-chip distributed Network-on-Chip (NoC) solution.
The SRAM-based FPGA will act as the proof-of-concept platform on which in-house SEU mitigation techniques will be evaluated, while the flash-based FPGA will supervise the experiment and handle the communication link with the On-Board Computer (OBC) of MIST. The architecture features flash configuration memories protected by Triple Modular Redundancy (TMR), as well as two COTS SDRAM memories connected to each FPGA. The real-case scenario on which the fault-tolerant architecture will be evaluated is image acquisition from a hosted camera, storage and compression of the image, and finally its transmission to the OBC. This thesis aims to contribute to the SEUD experiment by investigating three crucial features: the implementation of a novel SEU mitigation technique for COTS Synchronous Dynamic Random-Access Memory (SDRAM) devices using a prototype Error-Detection-And-Correction (EDAC) controller; the design and implementation of a prototype fault-tolerant communication bridge between the two FPGAs; and the implementation of a 2x3 mesh Nostrum Network-on-Chip (NoC) solution distributed over two physically separate FPGA chips.
Eleftherios Kyriakakis, MSc Thesis Report, November 30, 2017
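To make the two protection ideas named in the abstract concrete, here is a minimal Python sketch of a bitwise TMR majority voter and a textbook Hamming(7,4) single-error-correcting code. Both are generic illustrations only: the abstract does not say which code the prototype EDAC controller uses, so Hamming(7,4) is a stand-in, and the function names `tmr_vote`, `hamming74_encode` and `hamming74_decode` are hypothetical.

```python
from typing import Tuple


def tmr_vote(a: int, b: int, c: int) -> int:
    """Bitwise majority vote over three redundant copies of a word."""
    return (a & b) | (a & c) | (b & c)


def hamming74_encode(nibble: int) -> int:
    """Encode 4 data bits into a 7-bit codeword (integer bit i = position i+1)."""
    d = [(nibble >> i) & 1 for i in range(4)]
    p1 = d[0] ^ d[1] ^ d[3]  # parity over positions 1, 3, 5, 7
    p2 = d[0] ^ d[2] ^ d[3]  # parity over positions 2, 3, 6, 7
    p4 = d[1] ^ d[2] ^ d[3]  # parity over positions 4, 5, 6, 7
    bits = [p1, p2, d[0], p4, d[1], d[2], d[3]]  # positions 1..7
    return sum(bit << i for i, bit in enumerate(bits))


def hamming74_decode(codeword: int) -> Tuple[int, bool]:
    """Return (data nibble, corrected flag); repairs any single flipped bit."""
    bits = [(codeword >> i) & 1 for i in range(7)]
    # The syndrome is the XOR of the 1-based positions of all set bits.
    # It is 0 for a valid codeword and equals the position of a single flip.
    syndrome = 0
    for pos in range(1, 8):
        if bits[pos - 1]:
            syndrome ^= pos
    corrected = syndrome != 0
    if corrected:
        bits[syndrome - 1] ^= 1
    nibble = bits[2] | (bits[4] << 1) | (bits[5] << 2) | (bits[6] << 3)
    return nibble, corrected


if __name__ == "__main__":
    # One of three copies is corrupted; the voter still returns the good value.
    print(bin(tmr_vote(0b1010, 0b1010, 0b0010)))
    # Simulate an upset in bit position 5 of a stored codeword.
    stored = hamming74_encode(0b1011) ^ (1 << 4)
    print(hamming74_decode(stored))
```

A code this small cannot distinguish a double-bit upset from a single one; practical memory EDAC schemes typically add an overall parity bit (SECDED) and operate on wider words.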
Similar resources
A Comparative Study of VHDL Implementation of FT-2D-cGA and FT-3D-cGA on Different Benchmarks (RESEARCH NOTE)
This paper presents the VHDL implementation of a fault-tolerant cellular genetic algorithm (cGA). The goal of the paper is to harden the hardware implementation of the cGA against single event upsets (SEUs) that affect the fitness registers in the target hardware. The proposed approach consists of two phases: error monitoring and error recovery. Using innovative connectivity between processing elements ...
Investigation of transient fault effects in synchronous and asynchronous Network on Chip router
This paper presents a comparison of transient fault effects in an asynchronous NoC router and a synchronous one. The experiment is based on a simulation-based fault injection method to assess the fault-tolerant behavior of both architectures. The effort has been accomplished by employing fault injec...
Investigating SNNs for Identifying and Classifying Faults in Networks-on-Chip Computing Systems
Brief Project Description: Fault-free design of electronic systems is becoming increasingly difficult due to variations in the silicon manufacturing process, requiring systems to be adaptive to faulty conditions post deployment. Researchers have looked to building brain-inspired computing architectures, based on Spiking Neural Networks (SNN), which aim to mimic the efficient and self-adaptive i...
SEUs Mitigation on Program Counter of the LEON3 Soft Processor
Analyzing and evaluating the sensitivity of embedded systems to soft errors has always been a challenge for aerospace and safety equipment designers. Different automated fault-injection methods have been developed for evaluating the sensitivity of integrated circuits. Many techniques have also been developed to obtain a fault-tolerant architecture in order to mask and mitigate fault injection in a c...
New Fault Tolerant Design Methodology Applied to Middleware Switch Processor
This paper presents a new fault-tolerant design methodology that provides protection against the three most important radiation effects: single event transients (SET), single event upsets (SEU) and single event latchup (SEL). SETs and SEUs are mitigated using hardware redundancy. Protection against SEL effects is provided by a specially designed SEL power protection cell. Combination of ...